Overview

Dataset statistics

Number of variables: 10
Number of observations: 752
Missing cells: 388
Missing cells (%): 5.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 58.9 KiB
Average record size in memory: 80.2 B
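
The two derived figures above (missing-cell percentage and average record size) can be cross-checked from the other values. The following is a minimal sketch that assumes the percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 10
n_observations = 752
missing_cells = 388
total_size_kib = 58.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```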

Variable types

NUM: 9
BOOL: 1

Warnings

BloodPressure has 28 (3.7%) missing values [Missing]
Insulin has 360 (47.9%) missing values [Missing]
df_index has unique values [Unique]
Pregnancies has 108 (14.4%) zeros [Zeros]

Reproduction

Analysis started: 2020-09-22 14:44:30.342362
Analysis finished: 2020-09-22 14:45:00.316203
Duration: 29.97 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 752
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 385.0146277
Minimum: 0
Maximum: 767
Zeros: 1
Zeros (%): 0.1%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 38.55
Q1: 194.75
median: 385.5
Q3: 577.25
95-th percentile: 729.45
Maximum: 767
Range: 767
Interquartile range (IQR): 382.5

Descriptive statistics

Standard deviation: 221.5469605
Coefficient of variation (CV): 0.575424788
Kurtosis: -1.198922796
Mean: 385.0146277
Median Absolute Deviation (MAD): 191.5
Skewness: -0.003831991044
Sum: 289531
Variance: 49083.05571
Monotonicity: Strictly increasing
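
Several of the df_index figures above are derived from others in the same block. As a rough cross-check, and assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), the relationships can be sketched as follows; the variable names are illustrative.

```python
import math

# Values copied from the df_index statistics above.
minimum, maximum = 0, 767
q1, q3 = 194.75, 577.25
mean = 385.0146277
std = 221.5469605

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.2f}")

# Compare against the reported figures within a small tolerance.
assert math.isclose(cv, 0.575424788, rel_tol=1e-6)
assert math.isclose(variance, 49083.05571, rel_tol=1e-6)
```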
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
767                 1      0.1%
253                 1      0.1%
262                 1      0.1%
261                 1      0.1%
260                 1      0.1%
259                 1      0.1%
258                 1      0.1%
257                 1      0.1%
256                 1      0.1%
255                 1      0.1%
Other values (742)  742    98.7%

Value               Count  Frequency (%)
0                   1      0.1%
1                   1      0.1%
2                   1      0.1%
3                   1      0.1%
4                   1      0.1%

Value               Count  Frequency (%)
767                 1      0.1%
766                 1      0.1%
765                 1      0.1%
764                 1      0.1%
763                 1      0.1%

Pregnancies
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.85106383
Minimum: 0
Maximum: 17
Zeros: 108
Zeros (%): 14.4%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 3
Q3: 6
95-th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.375189327
Coefficient of variation (CV): 0.8764303778
Kurtosis: 0.1705680019
Mean: 3.85106383
Median Absolute Deviation (MAD): 2
Skewness: 0.9072902384
Sum: 2896
Variance: 11.39190299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Value               Count  Frequency (%)
1                   132    17.6%
0                   108    14.4%
2                   101    13.4%
3                   74     9.8%
4                   68     9.0%
5                   55     7.3%
6                   48     6.4%
7                   44     5.9%
8                   37     4.9%
9                   28     3.7%
Other values (7)    57     7.6%

Value               Count  Frequency (%)
0                   108    14.4%
1                   132    17.6%
2                   101    13.4%
3                   74     9.8%
4                   68     9.0%

Value               Count  Frequency (%)
17                  1      0.1%
15                  1      0.1%
14                  2      0.3%
13                  10     1.3%
12                  9      1.2%

Glucose
Real number (ℝ≥0)

Distinct: 135
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 121.9414894
Minimum: 44
Maximum: 199
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 44
5-th percentile: 80
Q1: 99.75
median: 117
Q3: 141
95-th percentile: 181
Maximum: 199
Range: 155
Interquartile range (IQR): 41.25

Descriptive statistics

Standard deviation: 30.60119806
Coefficient of variation (CV): 0.2509498467
Kurtosis: -0.2954030665
Mean: 121.9414894
Median Absolute Deviation (MAD): 20
Skewness: 0.5226732772
Sum: 91700
Variance: 936.4333229
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
100                 17     2.3%
99                  17     2.3%
129                 14     1.9%
111                 14     1.9%
106                 14     1.9%
112                 13     1.7%
108                 13     1.7%
95                  13     1.7%
125                 13     1.7%
109                 12     1.6%
Other values (125)  612    81.4%

Value               Count  Frequency (%)
44                  1      0.1%
56                  1      0.1%
57                  2      0.3%
61                  1      0.1%
62                  1      0.1%

Value               Count  Frequency (%)
199                 1      0.1%
198                 1      0.1%
197                 4      0.5%
196                 3      0.4%
195                 2      0.3%

BloodPressure
Real number (ℝ≥0)

MISSING

Distinct: 46
Distinct (%): 6.4%
Missing: 28
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.40055249
Minimum: 24
Maximum: 122
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 24
5-th percentile: 52
Q1: 64
median: 72
Q3: 80
95-th percentile: 91.7
Maximum: 122
Range: 98
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 12.37987032
Coefficient of variation (CV): 0.1709913792
Kurtosis: 0.9228827146
Mean: 72.40055249
Median Absolute Deviation (MAD): 8
Skewness: 0.1376292303
Sum: 52418
Variance: 153.2611892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)

Value               Count  Frequency (%)
70                  57     7.6%
74                  51     6.8%
78                  45     6.0%
72                  44     5.9%
68                  43     5.7%
64                  42     5.6%
80                  39     5.2%
76                  39     5.2%
60                  37     4.9%
62                  34     4.5%
Other values (36)   293    39.0%

Value               Count  Frequency (%)
24                  1      0.1%
30                  2      0.3%
38                  1      0.1%
40                  1      0.1%
44                  4      0.5%

Value               Count  Frequency (%)
122                 1      0.1%
114                 1      0.1%
110                 3      0.4%
108                 2      0.3%
106                 3      0.4%

SkinThickness
Real number (ℝ≥0)

Distinct: 51
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.17228464
Minimum: 7
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 7
5-th percentile: 14
Q1: 25
median: 29.17228464
Q3: 32
95-th percentile: 44
Maximum: 99
Range: 92
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 8.852102582
Coefficient of variation (CV): 0.303442212
Kurtosis: 5.344832133
Mean: 29.17228464
Median Absolute Deviation (MAD): 3.827715356
Skewness: 0.8162601036
Sum: 21937.55805
Variance: 78.35972012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
29.17228464         218    29.0%
32                  30     4.0%
30                  27     3.6%
27                  23     3.1%
23                  20     2.7%
33                  20     2.7%
28                  20     2.7%
18                  20     2.7%
31                  19     2.5%
19                  18     2.4%
Other values (41)   337    44.8%

Value               Count  Frequency (%)
7                   2      0.3%
8                   2      0.3%
10                  5      0.7%
11                  6      0.8%
12                  7      0.9%

Value               Count  Frequency (%)
99                  1      0.1%
63                  1      0.1%
60                  1      0.1%
56                  1      0.1%
54                  2      0.3%

Insulin
Real number (ℝ≥0)

MISSING

Distinct: 184
Distinct (%): 46.9%
Missing: 360
Missing (%): 47.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 156.0561224
Minimum: 14
Maximum: 846
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 14
5-th percentile: 42.55
Q1: 76.75
median: 125.5
Q3: 190
95-th percentile: 396.5
Maximum: 846
Range: 832
Interquartile range (IQR): 113.25

Descriptive statistics

Standard deviation: 118.8416898
Coefficient of variation (CV): 0.7615317355
Kurtosis: 6.356505089
Mean: 156.0561224
Median Absolute Deviation (MAD): 54.5
Skewness: 2.165116186
Sum: 61174
Variance: 14123.34723
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
105                 11     1.5%
140                 9      1.2%
130                 9      1.2%
120                 8      1.1%
94                  7      0.9%
100                 7      0.9%
180                 7      0.9%
110                 6      0.8%
135                 6      0.8%
115                 6      0.8%
Other values (174)  316    42.0%
(Missing)           360    47.9%

Value               Count  Frequency (%)
14                  1      0.1%
15                  1      0.1%
16                  1      0.1%
18                  2      0.3%
22                  1      0.1%

Value               Count  Frequency (%)
846                 1      0.1%
744                 1      0.1%
680                 1      0.1%
600                 1      0.1%
579                 1      0.1%

BMI
Real number (ℝ≥0)

Distinct: 246
Distinct (%): 32.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.45465426
Minimum: 18.2
Maximum: 67.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 18.2
5-th percentile: 22.2
Q1: 27.5
median: 32.3
Q3: 36.6
95-th percentile: 44.5
Maximum: 67.1
Range: 48.9
Interquartile range (IQR): 9.1

Descriptive statistics

Standard deviation: 6.928926198
Coefficient of variation (CV): 0.2134956097
Kurtosis: 0.8748420787
Mean: 32.45465426
Median Absolute Deviation (MAD): 4.6
Skewness: 0.5968157768
Sum: 24405.9
Variance: 48.01001826
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
32                  12     1.6%
31.2                12     1.6%
31.6                12     1.6%
32.4                10     1.3%
33.3                10     1.3%
30.8                9      1.2%
32.8                9      1.2%
32.9                9      1.2%
30.1                9      1.2%
34.2                8      1.1%
Other values (236)  652    86.7%

Value               Count  Frequency (%)
18.2                3      0.4%
18.4                1      0.1%
19.1                1      0.1%
19.3                1      0.1%
19.4                1      0.1%

Value               Count  Frequency (%)
67.1                1      0.1%
59.4                1      0.1%
57.3                1      0.1%
55                  1      0.1%
53.2                1      0.1%

DiabetesPedigreeFunction
Real number (ℝ≥0)

Distinct: 511
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4730505319
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 0.078
5-th percentile: 0.141
Q1: 0.244
median: 0.377
Q3: 0.6275
95-th percentile: 1.13105
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3835

Descriptive statistics

Standard deviation: 0.3301080525
Coefficient of variation (CV): 0.6978283085
Kurtosis: 5.592024879
Mean: 0.4730505319
Median Absolute Deviation (MAD): 0.17
Skewness: 1.9040609
Sum: 355.734
Variance: 0.1089713263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0.258               6      0.8%
0.254               6      0.8%
0.238               5      0.7%
0.207               5      0.7%
0.259               5      0.7%
0.268               5      0.7%
0.692               4      0.5%
0.263               4      0.5%
0.197               4      0.5%
0.551               4      0.5%
Other values (501)  704    93.6%

Value               Count  Frequency (%)
0.078               1      0.1%
0.084               1      0.1%
0.085               2      0.3%
0.088               2      0.3%
0.089               1      0.1%

Value               Count  Frequency (%)
2.42                1      0.1%
2.329               1      0.1%
2.288               1      0.1%
2.137               1      0.1%
1.893               1      0.1%

Age
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.3125
Minimum: 21
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 21
5-th percentile: 21
Q1: 24
median: 29
Q3: 41
95-th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.70939523
Coefficient of variation (CV): 0.3515015455
Kurtosis: 0.6166577132
Mean: 33.3125
Median Absolute Deviation (MAD): 7
Skewness: 1.116815734
Sum: 25051
Variance: 137.1099368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
22                  68     9.0%
21                  59     7.8%
25                  47     6.2%
24                  45     6.0%
23                  38     5.1%
28                  35     4.7%
26                  32     4.3%
27                  32     4.3%
29                  29     3.9%
31                  24     3.2%
Other values (42)   343    45.6%

Value               Count  Frequency (%)
21                  59     7.8%
22                  68     9.0%
23                  38     5.1%
24                  45     6.0%
25                  47     6.2%

Value               Count  Frequency (%)
81                  1      0.1%
72                  1      0.1%
70                  1      0.1%
69                  1      0.1%
68                  1      0.1%

Outcome
Boolean

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.9 KiB

Value               Count  Frequency (%)
0                   488    64.9%
1                   264    35.1%

Interactions

[Pairwise interaction plots of the numeric variables (81 images) omitted.]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
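
As an illustration of the calculation just described, here is a minimal pure-Python sketch. The function name `pearson_r` and the toy data are illustrative assumptions, not taken from the report; the 1/n factors in the covariance and the standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred products; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (spread_x * spread_y)

# Toy data (hypothetical): y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
r_linear = pearson_r(x, y)

# Shifting and positively rescaling y leaves r unchanged.
r_rescaled = pearson_r(x, [10.0 * v - 7.0 for v in y])

# Flipping the sign of the slope flips the sign of r.
r_negative = pearson_r(x, [-v for v in y])

print(r_linear, r_rescaled, r_negative)
```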

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
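
The rank-based calculation can be sketched in pure Python as follows. This is an illustrative sketch, not the report's implementation: the helper names are hypothetical, and tied values are assumed to receive the average of the ranks they span.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data (hypothetical): y = x**2 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data ρ reaches its maximum because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.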

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
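
The pair-counting definition above corresponds to the variant often called tau-a. A minimal sketch under that assumption follows; the function name is hypothetical, tied pairs count as neither concordant nor discordant, and no tie correction is applied (libraries commonly use a tie-adjusted variant, so their results can differ when the data contain ties).

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (hypothetical): one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```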

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Missing-value plots (4 images) omitted.]

Sample

First rows

     df_index  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
0    0         6            148.0    72.0           35.000000      NaN      33.6  0.627                     50   1
1    1         1            85.0     66.0           29.000000      NaN      26.6  0.351                     31   0
2    2         8            183.0    64.0           29.172285      NaN      23.3  0.672                     32   1
3    3         1            89.0     66.0           23.000000      94.0     28.1  0.167                     21   0
4    4         0            137.0    40.0           35.000000      168.0    43.1  2.288                     33   1
5    5         5            116.0    74.0           29.172285      NaN      25.6  0.201                     30   0
6    6         3            78.0     50.0           32.000000      88.0     31.0  0.248                     26   1
7    7         10           115.0    NaN            29.172285      NaN      35.3  0.134                     29   0
8    8         2            197.0    70.0           45.000000      543.0    30.5  0.158                     53   1
9    10        4            110.0    92.0           29.172285      NaN      37.6  0.191                     30   0

Last rows

     df_index  Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
742  758       1            106.0    76.0           29.172285      NaN      37.5  0.197                     26   0
743  759       6            190.0    92.0           29.172285      NaN      35.5  0.278                     66   1
744  760       2            88.0     58.0           26.000000      16.0     28.4  0.766                     22   0
745  761       9            170.0    74.0           31.000000      NaN      44.0  0.403                     43   1
746  762       9            89.0     62.0           29.172285      NaN      22.5  0.142                     33   0
747  763       10           101.0    76.0           48.000000      180.0    32.9  0.171                     63   0
748  764       2            122.0    70.0           27.000000      NaN      36.8  0.340                     27   0
749  765       5            121.0    72.0           23.000000      112.0    26.2  0.245                     30   0
750  766       1            126.0    60.0           29.172285      NaN      30.1  0.349                     47   1
751  767       1            93.0     70.0           31.000000      NaN      30.4  0.315                     23   0